feat(network): use Retry-After header for HTTP 429 responses #15463

Merged: 1 commit merged into rust-lang:master on May 7, 2025

Conversation

arlosi
Contributor

@arlosi arlosi commented Apr 29, 2025

What does this PR try to resolve?

Cargo registries that return HTTP 429 when the service is overloaded expect the client to retry the request automatically after a delay. Cargo currently does not retry for HTTP 429.

What changed?

  • Adds HTTP 429 (Too Many Requests) as a spurious HTTP error to enable retries.
  • Parses the Retry-After HTTP header to determine how long to wait before a retry.

In this implementation, the maximum delay is limited to Cargo's existing limit of 10 seconds. We could consider increasing that limit for this case, since the server is explicitly requesting the delay.
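As a rough illustration of the capping behavior described above, here is a minimal std-only sketch. It is not the code in this PR: `MAX_RETRY_DELAY`, `retry_after_seconds`, and `retry_delay` are hypothetical names, and only the delay-seconds form of Retry-After is handled (the HTTP-date form is ignored here).

```rust
use std::time::Duration;

/// Hypothetical constant mirroring the 10-second limit mentioned above.
const MAX_RETRY_DELAY: Duration = Duration::from_secs(10);

/// Parses the delay-seconds form of a Retry-After value, e.g. "120".
/// Returns None for anything else, including the HTTP-date form.
fn retry_after_seconds(value: &str) -> Option<Duration> {
    let secs: u64 = value.trim().parse().ok()?;
    Some(Duration::from_secs(secs))
}

/// Picks the delay before the next attempt: the server's request if it
/// parses, otherwise `fallback`, and never more than MAX_RETRY_DELAY.
fn retry_delay(retry_after: Option<&str>, fallback: Duration) -> Duration {
    let requested = retry_after
        .and_then(retry_after_seconds)
        .unwrap_or(fallback);
    requested.min(MAX_RETRY_DELAY)
}
```

Under this sketch, a server sending `Retry-After: 120` would still only be given a 10-second pause, which is the trade-off raised above.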

@rustbot
Collaborator

rustbot commented Apr 29, 2025

r? @ehuss

rustbot has assigned @ehuss.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-networking Area: networking issues, curl, etc. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 29, 2025
Member

@weihanglo weihanglo left a comment

cc #13530, which I believe this partially resolves.

@@ -47,6 +47,7 @@ use anyhow::Error;
use rand::Rng;
use std::cmp::min;
use std::time::Duration;
use time::OffsetDateTime;
Member

I think we are in the process of dropping time? See #15293.

Contributor Author

Converted to jiff

Contributor

Maybe using jiff isn't the right call?

In my testing, it can take 2 to 10 seconds for jiff to initialize.

I suspect part of the problem is that jiff is configured to use the system timezone data, and it's probably loading and initializing a bunch of junk. Unfortunately because gix-date enables the default features, we can't easily disable that AFAIK.

Can we maybe get gix-date to not use the default? I don't fully understand what all the jiff features do, or when or if they are necessary.

Contributor Author

It seems like jiff init is taking around that time in CI, which I agree is unacceptable. I'm not seeing the delay on Windows, but that's likely because jiff isn't using system timezone data there.

Contributor Author

I wonder if there's a way to do the parsing without loading the timezone data. The HTTP Date header is always GMT.
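To make that idea concrete, here is a hedged, std-only sketch of parsing the preferred HTTP date form (for example `Sun, 06 Nov 1994 08:49:37 GMT`) into Unix seconds with no timezone database. It is not what this PR does. `days_from_civil` and `parse_imf_fixdate` are hypothetical helper names, the weekday is not validated, and the obsolete date formats are not handled.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (Howard Hinnant's days_from_civil algorithm).
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Treat March as the first month so the leap day falls at year end.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = if month > 2 { month - 3 } else { month + 9 }; // 0 = March
    let doy = (153 * mp + 2) / 5 + day - 1; // day of year, 0..=365
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Parses "Sun, 06 Nov 1994 08:49:37 GMT" into Unix seconds.
/// Returns None if the text does not have that shape.
fn parse_imf_fixdate(s: &str) -> Option<i64> {
    let mut parts = s.trim().split_ascii_whitespace();
    let _weekday = parts.next()?; // e.g. "Sun," (not validated in this sketch)
    let day: i64 = parts.next()?.parse().ok()?;
    let month = match parts.next()? {
        "Jan" => 1, "Feb" => 2, "Mar" => 3, "Apr" => 4,
        "May" => 5, "Jun" => 6, "Jul" => 7, "Aug" => 8,
        "Sep" => 9, "Oct" => 10, "Nov" => 11, "Dec" => 12,
        _ => return None,
    };
    let year: i64 = parts.next()?.parse().ok()?;
    let mut hms = parts.next()?.split(':');
    let hour: i64 = hms.next()?.parse().ok()?;
    let minute: i64 = hms.next()?.parse().ok()?;
    let second: i64 = hms.next()?.parse().ok()?;
    // The zone must be the literal "GMT" and nothing may follow it.
    if hms.next().is_some() || parts.next()? != "GMT" || parts.next().is_some() {
        return None;
    }
    let in_range = (1..=31).contains(&day)
        && (0..=23).contains(&hour)
        && (0..=59).contains(&minute)
        && (0..=60).contains(&second); // 60 tolerates a leap second
    if !in_range {
        return None;
    }
    Some(days_from_civil(year, month, day) * 86_400 + hour * 3_600 + minute * 60 + second)
}
```

Because the zone is fixed to GMT, the conversion is pure arithmetic, which is why no system timezone data should be needed for this particular header format.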

@arlosi
Contributor Author

arlosi commented Apr 29, 2025

> cc #13530, which I believe this partially resolves.

This only applies to things that use Retry. IIRC the publishing logic doesn't have built-in retries at all.

@ehuss
Contributor

ehuss commented Apr 29, 2025

Would it be possible to add a test using our HttpServer to ensure that this works going through curl? I imagine it would be something like adding a custom_responders?

@rustbot rustbot added the A-testing-cargo-itself Area: cargo's tests label May 5, 2025
Comment on lines 379 to 391
let expected = jiff::Zoned::now()
    .until(
        &jiff::civil::date(2100, 1, 1)
            .at(0, 0, 0, 0)
            .to_zoned(jiff::tz::TimeZone::UTC)
            .unwrap(),
    )
    .unwrap()
    .total(jiff::Unit::Millisecond)
    .unwrap() as u64;
let actual = Retry::parse_retry_after(&headers).unwrap();
let diff = expected.abs_diff(actual.into());
assert!((diff < 1000), "{} != {} ({})", expected, actual, diff);
Contributor

Independent of the problem with jiff initialization, I would recommend moving the expected initialization to after the actual line. That should reduce the time variance (since computing expected is very efficient, whereas parse_retry_after is not).

Contributor Author

I'll change parse_retry_after to take in now as a parameter to simplify the testing and eliminate the variation.
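The general shape of that change, sketched with std types only: the function receives the current time instead of reading the clock, so a test can pin `now` and compare exact values. `delay_until` is a hypothetical name for illustration and is not the signature used in this PR.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical helper: how long to wait from `now` until `target`.
/// Returns zero if `target` is not in the future. Taking `now` as a
/// parameter keeps the function deterministic under test.
fn delay_until(target: SystemTime, now: SystemTime) -> Duration {
    target.duration_since(now).unwrap_or(Duration::ZERO)
}
```

With a fixed `now`, a test can assert an exact delay rather than allowing a tolerance such as the 1000 ms window in the snippet above.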

Contributor

Correction, I think I had that backwards. It looks like jiff::Zoned::now() is what is taking all the time.

Contributor Author

Can you try again with the latest push and see if it's fixed?

Contributor

I reported this upstream at BurntSushi/jiff#366.

Contributor

@ehuss ehuss left a comment

Seems reasonable to me, thanks!

@ehuss ehuss added this pull request to the merge queue May 7, 2025
Merged via the queue into rust-lang:master with commit a154422 May 7, 2025
23 checks passed
Labels
A-networking Area: networking issues, curl, etc. A-testing-cargo-itself Area: cargo's tests S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
4 participants